REMARKS 

In the Office Action, currently-pending Claims 1-84 are rejected under 35 
U.S.C. §102(e) over the reference of Nagayasu, et al., U.S. Patent Application
Number 2003/0118197 and under §102(b) over Hauser, German Patent Number
DT2628259. Both of those references are cited in §102(b) rejections. However,
the Examiner does not give any specific analysis of the claims. Rather, the 
claims are rejected as being anticipated based upon the International Search 
Report dated November 8, 2004 from the corresponding PCT Application. 
Furthermore, the Examiner objected to a specific citation in the Information 
Disclosure Statement.

Information Disclosure Statement 

The Examiner indicates that the patents listed on the Search Report were 
considered, but they will not be listed on the Patent resulting from the Application 
because they were not provided in a separate list in compliance with 37 C.F.R. 
1.98(a)(1). However, Applicants note that in the Supplemental Information 
Disclosure Statement filed August 23, 2004, wherein the International Search 
Report was listed, all of the individual references set forth in the Search Report 
were also individually listed in the PTO 1449 Forms filed with the Information 
Disclosure Statement. Furthermore, that 1449 Form listing those references was 
returned initialed by the Examiner, and thus, those individual references should 
be listed on the patent. 

Applicants further note that an earlier IDS was filed on February 8, 2004.
However, in the most recent Office Action, the Examiner did not return the initialed
forms associated with that earlier IDS. Applicants respectfully request
consideration of those documents and return of the initialed form indicating their
consideration by the Examiner.

Section 102 Rejections

As noted above, the Examiner has rejected the pending Claims 1-84 
based upon the references of Nagayasu, et al. and Hauser, noting that the
rejection is based on the same reasons as set forth in the International Search 
Report. However, the International Search Report was a cursory analysis of 
the cited art, and its applicability to the pending claims. In the further analysis 
of the cited art and the pending claims, it becomes very clear that the cited 
references do not at all teach the limitations that are set forth in the claims, and 
thus, those references cannot anticipate those claims under §102(e) or
§102(b).

The present invention is directed to a headset, system, and method for 
selectively transmitting captured audio signals based on a preliminary 
determination that those audio signals contain user speech. Representations 
of the captured audio signals are selectively transmitted for further speech 
recognition processing. If user speech is not detected, the representations of 
the captured audio signals are not forwarded to another device, such as a 
device having speech recognition capabilities. In that way, needless 
transmission of captured sound is avoided, thus prolonging the battery life of
the headset as well as avoiding constant RF transmission from the head of a
user to the remote device. Neither of the cited references of Nagayasu, et al.
nor Hauser teaches a device having such capabilities, and neither of the
references is concerned with needless sound transmission or constant RF
transmission from the head of a user. In fact, both Nagayasu, et al. and Hauser
teach away from the invention because they teach a constant transmission of 
captured sound from a headset. Furthermore, neither of those references 
teaches an initial or preliminary evaluation of captured audio signals to 
determine if speech is present, and then a selective transmission of the 
captured audio signals, based on the determination that user speech is 
detected in those audio signals. 

Various of the claims have been amended to further clarify those claims. 
Several of the claims have been cancelled. Based upon any reasonable 
interpretation of the cited references of Nagayasu, et al. and Hauser, those
references clearly do not teach all the elements recited in the current claims.

Turning to the Nagayasu, et al. reference, the Examiner cites three
Nagayasu, et al. references in the PTO 892 Form attached with the Office
Action. Those include U.S. Application Number 2003/0118197 and U.S. Patent
No. 7,110,800, which is the issued Patent of the '197 Application. The third
reference, 2005/0232436, is essentially a divisional of the '197 Application.
Accordingly, all those references are essentially the same disclosure, and will
be referred to as the Nagayasu, et al. application herein.

The Nagayasu, et al. application discloses a headset device which
processes sound signals, including radio signals that include speech signals.
Furthermore, one embodiment of the Nagayasu, et al. device has some speech
recognition capabilities. However, despite the Nagayasu, et al. reference
teaching some speech processing and also some speech recognition, that 
reference still does not teach the invention as recited in the pending claims. 

Generally, the Nagayasu, et al. reference discloses a headset device that
receives both speaking sounds, such as through short-range radio
communications through a microphone, and also external sounds that are picked
up by a second, sound detection microphone. The external sounds may contain
some of the speech from the radio communication. The headset in the
Nagayasu, et al. reference receives both the radio sounds and the external
sounds and is configured for selectively adjusting the ratio of external sounds
and the radio signals for use by the headset. That is, the Nagayasu, et al.
reference teaches a headset that allows the user to adjust the ratio of speech to 
direct external sounds that it receives so that the wearer can emphasize or hear 
one of those sounds more than the other. 

However, the Nagayasu, et al. headset makes no determination with
respect to the actual content coming from either source (i.e., radio or external
microphone) in order to determine if speech is actually in one of the signals. The
Nagayasu, et al. reference assumes that some of the radio signals are speech
radio signals, because they are coming from another speaker wearing a headset
who is transmitting by radio to a second headset. However, the noises picked up
by the speaker's microphone and transmitted by radio might also be external
noises that exist at the location of that speaker. The Nagayasu, et al. headset
does not know the specific contents or information of either of the two signals,
and, in fact, does not care. It transmits on the radio channel regardless of the
content of the captured sounds. The radio communication is essentially
continuous and is not at all selective based on the content of the captured audio
signals that are sent over the radio link. Furthermore, the Nagayasu, et al.
headset does not form sampled representations of the audio signals so that
speech detection circuitry might initially be utilized with the sampled
representations of the audio signals to determine that the audio signals include
user's speech. Rather, the headset radios of Nagayasu, et al. just send the raw
audio to the receiving radio to be played through a speaker in the traditional
sense. Again, the Nagayasu, et al. headset does not care what the noises
are; it merely adjusts the ratio between two different noise or sound sources for a
listener to hear one more than the other. Additionally, there is no teaching of
selective transmission of the signals to a device for further speech recognition 
based on the preliminary speech detection analysis at the headset. 

Accordingly, the Nagayasu, et al. reference does not in any way teach a
headset that processes sampled representations of audio signals that are
captured by the headset and uses speech detection circuitry to preliminarily
determine that the audio signals do indeed include user's speech.
Furthermore, the headset as set forth in the Nagayasu, et al. patent is
constantly transmitting. It does not selectively transmit based on the
determination that user's speech is detected. Again, in the Nagayasu, et al.
reference, the headset does not care about the content of the sounds and 
certainly makes no determination based upon the content of the audio signals 
in order to make a decision to transmit or not transmit those audio signals. 

Even in the embodiment of the headset set forth in Nagayasu, et al. that
has speech recognition capabilities, speech recognition is completed at one 
location (i.e., the headset), and only the actual end results, not sampled 
representations of the audio, are sent. In the present invention, the headset 
provides an initial speech detection to determine whether any further 
processing is necessary, and whether representations of the captured audio 
signal should be sent selectively to another device for completion of the speech 
processing or the back-end processing. 

In the Nagayasu, et al. device, such an embodiment would require
complete speech processing capability on a single headset. As noted in the 
background of the present Application, such a solution would not be currently 
practical and does not address power issues associated with the portable 
headset of the invention. As noted, the present invention provides a means for 
those workers in a voice-driven environment to receive voice instructions, ask 
questions, report the progress of their tasks, report working conditions, and 
otherwise collect data associated with their voice-driven work. As such, a 
headset must be constantly worn on the head throughout a work shift. 
Therefore, a heavy battery would be necessary for powering a full speech 
recognizer and voice synthesizer and such a configuration on a headset would 
be impractical. Furthermore, in any operation of the Nagayasu, et al. headset,
there is no selective transmission of representations of the captured audio
signals or selective reduction of data transferred by the headset due to an initial
or preliminary detection of speech signals within the captured audio signals.
Accordingly, the Nagayasu, et al. reference does not anticipate the invention
recited in the pending claims. 

With respect to Hauser, we obtained an English translation of the Detailed
Description portion of that reference (which is enclosed), and that reference also 
does not teach or render obvious the claimed invention. 

Primarily, the intercom system of the Hauser reference is directed to 
conveying speech signals between speakers when there is a lot of background 
environmental noise, such as between a motorcycle driver and passenger. 
Specifically, the intercom system utilizes a diplexer which frequency divides, or 
frequency separates those signals that might be considered useful, such as in 
the typical speech band (e.g., 300 Hz - 3000 Hz), from out-of-band noise.
Therefore, based on frequency, the Hauser intercom system makes a
determination that those signals between 300 Hz and 3000 Hz might be useful,
and all other signals outside of that band might be noise. However, the intercom 
system of the Hauser reference only divides signals by frequency. It is not 
concerned with the information or the form of the signals in that frequency band. 
The Hauser device does not in any way process sampled representations of 
those audio signals using speech detection circuitry, in order to determine that 
the audio signals in one separate frequency band actually include user's speech. 
The Hauser intercom system utilizes a diplexer to make the initial frequency
division of signals. Therefore, it captures signals in the 300 Hz - 3000 Hz range
and transmits those signals regardless of any content. The Hauser reference
also realizes that additional noise may fall within the 300 Hz - 3000 Hz band.
Since the Hauser intercom system does not have the ability to process and
analyze the sampled representations of the audio signals to detect speech, it
must accept the noise that falls within the designated speaking band and transmit 
or send all signals, including noise, falling in the specific frequency band. 

Furthermore, the Hauser intercom system is generally always transmitting 
the signals falling in the specific frequency band, and thus, does not selectively 
transmit sampled representations of the captured audio signals based on the 
determination that user's speech is detected as set forth in the currently-pending 
claims. If sounds are detected in the 300 Hz - 3000 Hz band, they will be 
transmitted, whatever levels of sound they are, even if they do not contain 
speech. Furthermore, what is constantly transmitted or sent in Hauser are raw 
audio signals for replay at a speaker. There is no processing of the audio, and 
there is no transmission of sampled representations of the audio signals taught in 
Hauser. Again, the Hauser reference is simply doing frequency segregation and
is not processing sampled representations of the audio signals using speech 
detection circuitry in order to preliminarily determine that the audio signals 
actually include user's speech or selectively transmitting such representations 
based on that determination. 

As such, the present invention is not taught or anticipated or rendered
obvious by the Hauser reference. 

Turning now to Claim 1, that claim recites a headset configured for
processing sampled representations of audio signals captured by the headset, 
and using speech detection circuitry to determine that the audio signals include 
user speech. That is, a preliminary detection is made to determine whether 
those audio signals are merely non-useful sounds, or actual user speech from a 
user wearing the headset that should be further processed. Furthermore, Claim 
1 recites that the headset is further configured for selectively transmitting 
sampled representations of the captured audio signals to a device based on the 
determination that user speech is actually detected in those audio signals. As 
noted above, the Nagayasu, et al. reference does not at all teach a headset with
speech detection circuitry to determine that audio signals include user speech,
and then to selectively transmit (or not transmit) representations of those
captured audio signals based on the determination that user speech is detected.
In the embodiments disclosed in Nagayasu, et al., the headset does not care
what the content of the various audio signals is; rather, it is just concerned with
the ratio of those sound signals received by one source, with respect to the
sound signals received by another source. Furthermore, even in the
embodiment of Nagayasu, et al. discussing speech recognition, only the data
results of the speech recognition are transmitted, not sampled representations of
the captured audio signals for further processing. In fact, there would be no need
for sampled representations of the captured audio signals in Nagayasu, et al.
because the speech recognition would already be complete. In the present
invention, additional back-end speech processing is provided by another device,
and thus, that device must receive sampled audio signals to process.
Furthermore, even with the speech recognition capability of Nagayasu, et al., that
headset is constantly transmitting something. There is no selective transmitting 
of sampled representations of the captured audio signals based on the 
determination that user speech is detected or not detected. 

As noted above, the Hauser reference does not teach a headset 
configured to process sampled representations of audio signals using speech 
detection circuitry to determine that the audio signals include user speech. 
Furthermore, Hauser does not teach a headset that selectively transmits the 
sampled representations of the captured audio signals based on that 
determination that is made regarding the content of speech. 

Accordingly, Independent Claim 1 cannot be anticipated by the 
Nagayasu, et al. or the Hauser reference under §102 because neither of those
references teaches all of the specific elements recited in that claim as required 
under §102. As such, Claim 1 is allowable over that cited art. 

Of the Dependent Claims 2-17, which depend from Claim 1, Claims 2-3, 
6, and 16-17 are cancelled. The other dependent claims are allowable for the
reasons cited above with respect to Claim 1 and are further allowable because
each of those claims recites a unique combination of elements, which is not
taught by either Nagayasu, et al. or Hauser. As such, those pending
dependent claims are allowable as well.

Independent Claim 18 recites a headset with some similar limitations, as
noted above, with respect to Claim 1. Specifically, the headset includes a
microphone for receiving audio signals and also includes processing circuitry 
that is configured for analyzing sampled representations of the audio signals to 
detect if the sampled representations include user speech. Furthermore, Claim 
18 recites circuitry configured for selectively transmitting the sampled 
representations of the audio signals to a device when user speech is detected, 
and generally not transmitting to a device when user speech is not detected. 

As noted above, neither Nagayasu, et al. nor Hauser teaches a headset
that analyzes sampled representations of audio signals to detect if they include 
user speech, and then selectively transmits those sampled representations 
when user speech is detected, and generally does not transmit when user 
speech is not detected. Again, the Nagayasu, et al. reference is primarily
concerned with varying the ratio between two sound sources, such as external 
sounds and radio sounds to vary what a user hears through the headset. 
Hauser, on the other hand, merely frequency segregates signals, but still
transmits all the raw audio in the frequency band, regardless of its content. As 
such, there is absolutely no teaching of the elements recited in Claim 18 
directed to analyzing sampled representations of audio signals to determine if 
they include user speech and then selectively transmitting those sampled 
representations of the audio signals if user speech is detected and not 
transmitting if user speech is not detected. Even in the Nagayasu, et al.
embodiment that utilizes speech recognition, the signals are essentially
processed through the speech recognition process at the headset. There is no
initial analysis of sampled representations, nor is there a selective transmission 
based upon that initial analysis so that sampled representations would be 
transmitted when user speech is detected, but generally not transmitted when 
user speech is not detected. Accordingly, because the cited references do not 
teach all the elements recited in Claim 18, Claim 18 would not be anticipated 
under §102 by the cited art, and thus, would be allowable. Dependent Claims
19-27 depend from Claim 18, and thus, include all the limitations therein. 
Accordingly, those claims would also be in an allowable form for the reasons 
noted above. Those claims are further allowable because they recite a unique 
combination of elements not taught by the cited art. 
Claim 28 has been cancelled.

Claim 29 is an independent claim that recites a system for wireless 
communications that comprises a device configured for processing speech 
signals and a headset for capturing the audio signal to be processed. Claim 29 
recites that the headset is configured for initially processing sampled 
representations of the captured audio signals using speech detection circuitry 
to determine if the audio signals include user's speech. Claim 29 also recites 
that the headset selectively wirelessly transmits, to the device, the sampled 
representations of the captured audio signals based on the determination that 
speech has been detected. For similar reasons as noted above with respect to 
Claims 1 and 18, the cited references of Nagayasu, et al. and Hauser do not teach all
the elements recited in Claim 29. For example, those references do not teach
a headset that initially processes sampled representations of captured audio
signals to determine that those audio signals do indeed include user speech. 
Nor do those references teach selective wireless transmission of the sampled 
representations based on that determination. Rather, as noted above, the two 
references do not make any discrimination with respect to transmitting signals, 
but merely are primarily concerned with either varying the ratio between two 
noise sources, or simply making a frequency segregation of signals to try to 
eliminate some noise. Accordingly, Claim 29 is not anticipated by the cited 
references under §102, and is allowable over the cited art. Claims 30-44 are
dependent claims, which depend from Claim 29. Claim 34 has been
cancelled. The remaining claims each recite the limitations set forth in Claim 
29, and thus, would be allowable for that reason. Additionally, each of those 
claims recites a unique combination of elements and none of those 
combinations are taught by the cited references. Accordingly, those dependent 
claims are also in an allowable form. 

Claim 45 is an independent method claim, including the limitations along 
the lines set forth in Claim 1. Specifically, the method comprises the steps of 
capturing audio signals and processing sampled representations of the audio 
signals using speech detection circuitry to determine if the audio signals include 
user speech. The method of Claim 45 further recites selectively transmitting 
sampled representations of the captured audio signals to a device based on the 
determination that user speech is detected. For the reasons discussed above 
with Claim 1 and other of the independent claims, the cited references of 
Nagayasu, et al. and Hauser do not teach all of the elements set forth in Claim
45. Thus, Claim 45 is not anticipated by those references under §102, and is 
allowable. Dependent Claims 46-57 each depend from Claim 45, and thus, 
include the limitations therein, making those claims also allowable over the 
cited art. Furthermore, those claims recite unique methods, which are not 
anticipated by the cited art. Accordingly, those dependent claims are also in an 
allowable form. 

Claim 58 is also an independent claim. That claim recites a headset for 
communication with a remote device. Claim 58 recites a headset with a 
microphone system configured to capture audio signals, including user speech,
and circuitry responsive to the output of the microphone system to detect user 
speech, and configured to reduce the amount of microphone system output 
data communicated to the remote device based on the user speech detection. 

In one aspect of the present invention, the microphone system captures 
audio signals, and thus, relays those signals through the headset to be 
transmitted to a remote device. As noted above, in the cited art of Nagayasu,
et al. or Hauser, the headsets transmit the signals they receive to other
headsets without making any sort of analytical evaluation of those signals to
detect user speech. For example, Nagayasu, et al. essentially deals with
signals from two sources, and, depending upon a switch, will change the ratio 
of signals that are sent from one of those sources such that one source is the 
primary source. Hauser, on the other hand, merely utilizes a frequency filter,
such as a diplexer, to divide out signals in a certain frequency range, which 
may be of interest. It does not care about the content of those frequency 
signals, but rather sends everything that fits within the selective frequency 
band. The present invention, on the other hand, makes such a determination to 
see if the audio signals contain user speech that should be further processed. 
It does not merely transmit on those signals, regardless of their content. 
Rather, it will utilize the preliminary speech detection processing in order to 
make an evaluation, and, if the speech detection does not detect user speech, 
the headset does not transmit those captured audio signals. This evaluation
thus reduces the amount of microphone system output data that is
communicated to the remote device. The other references, on the other hand, 
essentially send whatever is received by the selected microphone. 
Accordingly, the references of Nagayasu, et al. and Hauser do not teach all of the
elements recited in Claim 58, and thus, do not anticipate Claim 58 under §102.
Claim 58 is thus allowable. Dependent Claims 59-71 each depend from Claim
58. Claim 66 has been cancelled. Accordingly, each of those dependent 
claims is also in an allowable form for the reasons noted above. Furthermore, 
each of those dependent claims is further allowable because it recites a unique 
combination of elements not anticipated by the cited references.

Claim 72 is also an independent claim and recites a headset for 
communication with a remote device that is capable of speech recognition 
processing. However, as noted above, the claimed headset does not 
automatically forward all captured audio signals for such further speech 
processing by a remote device. Rather, Claim 72 recites that the headset is 
configured to sample the audio signals captured by the headset to make an 
initial detection of whether the captured audio signals include user speech. 
Furthermore, Claim 72 recites that the headset is operable to transmit to the 
device sampled representations of the captured audio signals for further 
speech recognition processing only when user speech is detected. Again, the 
non-discriminatory headsets of the cited art of Nagayasu, et al. or Hauser do
not make an initial detection of whether the captured audio signals include user
speech. Nor do they make a selective transmission of sampled representations
of the captured audio only when user speech is detected. Rather, they always 
send their captured signals, based only on those captured signals being within 
a certain frequency or switched between two sources, such as a speaker's 
microphone or an external microphone. Accordingly, the cited references do 
not teach all the elements recited in Claim 72, and thus, do not anticipate that 
claim. Claim 72 is thus allowable. Dependent Claim 73 depends from Claim
72, and recites all the limitations therein. Thus, Claim 73 is allowable for that 
reason as well. 

Claim 74 is an independent claim that recites a voice-driven speech 
recognition system that has distributed components comprising a microphone 
system, user speech digitizer, user speech detector, and back-end speech 
recognizer. Claim 74 recites that the system further comprises a headset that 
includes at least one of the microphone system and speech digitizer wherein 
the balance of the components may be contained in one or more devices 
located or removed from the headset. Claim 74 further recites that the headset 
is configured for transmitting to the one or more devices, the output of the
microphone system in the form of spectral representations of audio signals that 
are captured by the microphone system. That is, Claim 74 does not send on 
raw audio like the cited references of Nagayasu, et al. or Hauser, but rather
transmits spectral representations of the audio signals for further back-end 
speech recognition. Claim 74 further recites that the headset includes a user 
speech detector, which is used to at least partially suppress from the 
transmitted output, the spectral representations of audio signals, which do not 
represent user speech. That is, as noted above, the present invention is 
directed to preliminarily evaluating the captured audio signals to determine if 
speech is present, and then selectively transmitting or selectively suppressing 
the transmission of spectral representations of the audio signals if speech is not 
present. Such elements are not at all taught by the cited references, and thus, 
Claim 74 is not anticipated by those references under §102, thus making Claim
74 also allowable. Claims 75-84 all depend from Claim 74, and include the 
limitations therein. Thus, those claims would also not be anticipated by the 
references and would be allowable. Furthermore, each of those dependent 
claims recites a unique system, which is not anticipated by the cited art. 
Accordingly, those dependent claims are also in an allowable form. 

Applicants submit that all of the pending claims are allowable and 
respectfully request an indication of their allowability at the Examiner's earliest 
convenience. 

Applicants are submitting the fee due for the one-month extension of 
time with this response. If any additional fees are necessary, the 
Commissioner may consider this to be a request for such and charge any 
necessary fees to deposit account 23-3000. 